perf: remove loop from `str::floor_char_boundary` #149466

overlookmotel · 2025-11-30T00:47:11Z

#144472 made str::floor_char_boundary a const function, but in doing so introduced a loop. This is unnecessary because the next UTF-8 character boundary can only be within a 4-byte range.

This produces excessive code for e.g. str.floor_char_boundary(20), because the loop is unrolled.

https://godbolt.org/z/5f3YsM6oK

This PR replaces the loop with 3 checks in series.

In addition to reducing code size in some cases, it also removes bounds checks from calling code when following floor_char_boundary with a call to String::truncate (which I assume might be a common pattern).

Notes

I tried using index.unchecked_sub(1), but found it doesn't remove bounds checks, whereas index.checked_sub(1).unwrap_unchecked() does. Surprising!
The assert_uncheckeds are required to elide checks from code following the floor_char_boundary call e.g.:

let index = string.floor_char_boundary(20);
string.truncate(index);

If this PR is accepted, I'll follow up with a similar PR for ceil_char_boundary.

Very happy to receive feedback and amend.

rustbot · 2025-11-30T00:47:15Z

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

clarfonthey · 2025-11-30T06:09:16Z

Rather than manually unrolling the code here, it might be worth considering if the original implementation can be made const instead:

rust/library/core/src/str/mod.rs

Lines 244 to 247 in 03b17b1

    
           let lower_bound = index.saturating_sub(3); 
        
           let new_index = self.as_bytes()[lower_bound..=index] 
        
               .iter() 
        
               .rposition(|b| b.is_utf8_char_boundary());

I had this exact thought in the original implementation, which is why I explicitly only checked the four following bytes for a character boundary. Note that I haven't fully scrutinised the code and yours may be better regardless, but I figured I'd mention this for some extra context. I would expect LLVM to automatically unroll a 4-max-iteration loop in most contexts.

quaternic · 2025-11-30T08:13:19Z

library/core/src/str/mod.rs

+        let bytes = self.as_bytes();
+        let boundary_index = if bytes[index].is_utf8_char_boundary() {
+            index
        } else {
-            let mut i = index;
-            while i > 0 {
-                if self.as_bytes()[i].is_utf8_char_boundary() {
-                    break;
+            // SAFETY: `bytes[index]` is a UTF-8 continuation byte, therefore there must be a byte before it.
+            // Note: `index.unchecked_sub(1)` would be preferable, but it doesn't the remove bounds check
+            // from `bytes[previous_index]`.
+            let previous_index = unsafe { index.checked_sub(1).unwrap_unchecked() };
+            if bytes[previous_index].is_utf8_char_boundary() {
+                previous_index
+            } else {
+                // SAFETY: `bytes[index - 1]` is a UTF-8 continuation byte, therefore there must be a byte before it
+                let previous_previous_index = unsafe { index.checked_sub(2).unwrap_unchecked() };
+                if bytes[previous_previous_index].is_utf8_char_boundary() {
+                    previous_previous_index
+                } else {
+                    // UTF-8 character sequences are at most 4 bytes long.
+                    // `bytes[index - 2]`, `bytes[index - 1]`, and `bytes[index]` are all continuation bytes,
+                    // therefore `bytes[index - 3]` must be the 1st byte in a 4-byte UTF-8 character sequence.
+                    debug_assert!(bytes[index - 3].is_utf8_char_boundary());
+                    index - 3
                }
-                i -= 1;
            }
+        };


The deep nesting isn't ideal. You could try matching slice patterns:

let bytes = &self.as_bytes()[..=index]; let boundary_index = match bytes { [.., b] if b.is_utf8_char_boundary() => index, [.., b, _] if b.is_utf8_char_boundary() => index - 1, [.., b, _, _] if b.is_utf8_char_boundary() => index - 2, [.., b, _, _, _] if b.is_utf8_char_boundary() => index - 3, // SAFETY: TODO _ => unsafe { debug_assert!(bytes[index.saturating_sub(3)].is_utf8_char_boundary()); unreachable_unchecked() } };

This is much nicer. I'll check if it optimizes equally as well.

okaneco · 2025-11-30T19:01:28Z

You might want to add a codegen test under tests/codegen-llvm to make sure the panics/slice fail checks that you're targeting are optimized out.

At the time, I tried to add a counter variable to limit the number of iterations but I wasn't able to remove the panics like the current implementation does.

slice_new and truncate_new still have panic paths when the index is variable. Variable index slice doesn't have a panic in the current implementation. Variable truncate has a panic for both implementations.
https://godbolt.org/z/T1TexYbE7

I wrote an example codegen test to help with testing this out (others may be able to improve it further).
You can run the tests with ./x test [path to file or directory] (save time by iterating over the IR/assembly in godbolt first).
Again, the variable truncate may not be possible to enable currently.
https://rustc-dev-guide.rust-lang.org/tests/compiletest.html#codegen-tests

Example codegen test file

// Test that panics and slice errors are elided from `floor_char_boundary` implementation.

//@ compile-flags: -Copt-level=3

#![crate_type = "lib"]

// CHECK-LABEL: @floor_fixed_index
#[no_mangle]
pub fn floor_fixed_index(s: &str) -> usize {
    // CHECK-NOT: panic
    s.floor_char_boundary(20)
}

// CHECK-LABEL: @floor_variable_index
#[no_mangle]
pub fn floor_variable_index(s: &str, i: usize) -> usize {
    // CHECK-NOT: panic
    s.floor_char_boundary(i)
}

// CHECK-LABEL: @floor_slice_fixed_index
#[no_mangle]
pub fn floor_slice_fixed_index(s: &str) -> &str {
    // CHECK-NOT: panic
    // CHECK-NOT: slice_error_fail
    let index = s.floor_char_boundary(20);
    &s[..index]
}

// CHECK-LABEL: @floor_slice_variable_index
#[no_mangle]
pub fn floor_slice_variable_index(s: &str, i: usize) -> &str {
    // CHECK-NOT: panic
    // CHECK-NOT: slice_error_fail
    let index = s.floor_char_boundary(i);
    &s[..index]
}

// CHECK-LABEL: @truncate_from_floor_fixed_index
#[no_mangle]
pub fn truncate_from_floor_fixed_index(s: &mut String) {
    // CHECK-NOT: panic
    let index = s.floor_char_boundary_new(20);
    s.truncate(index)
}

// CHECK-LABEL: @truncate_from_floor_variable_index
#[no_mangle]
pub fn truncate_from_floor_variable_index(s: &mut String, i: usize) {
    // CHECK-NOT: panic
    let index = s.floor_char_boundary_new(i);
    s.truncate(index)
}

overlookmotel · 2025-11-30T19:05:10Z

slice_new and truncate_new still have panic paths when the index is variable. Variable index slice doesn't have a panic in the current implementation. Variable truncate has a panic for both implementations.

Yes, I noticed this today. Am hoping to eliminate all these panics. I feel like it should be possible, but... mysteries of the compiler...

perf: remove loop from str::floor_char_boundary

3c96dfa

rustbot assigned scottmcm Nov 30, 2025

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 30, 2025

quaternic reviewed Nov 30, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: remove loop from `str::floor_char_boundary` #149466

perf: remove loop from `str::floor_char_boundary` #149466

overlookmotel commented Nov 30, 2025 •

edited

Loading

Uh oh!

rustbot commented Nov 30, 2025

Uh oh!

clarfonthey commented Nov 30, 2025 •

edited

Loading

Uh oh!

quaternic Nov 30, 2025

Uh oh!

overlookmotel Nov 30, 2025

Uh oh!

okaneco commented Nov 30, 2025 •

edited

Loading

Uh oh!

overlookmotel commented Nov 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

perf: remove loop from str::floor_char_boundary #149466

Are you sure you want to change the base?

perf: remove loop from str::floor_char_boundary #149466

Conversation

overlookmotel commented Nov 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Notes

Uh oh!

rustbot commented Nov 30, 2025

Uh oh!

clarfonthey commented Nov 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

quaternic Nov 30, 2025

Choose a reason for hiding this comment

Uh oh!

overlookmotel Nov 30, 2025

Choose a reason for hiding this comment

Uh oh!

okaneco commented Nov 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

overlookmotel commented Nov 30, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

perf: remove loop from `str::floor_char_boundary` #149466

perf: remove loop from `str::floor_char_boundary` #149466

overlookmotel commented Nov 30, 2025 •

edited

Loading

clarfonthey commented Nov 30, 2025 •

edited

Loading

okaneco commented Nov 30, 2025 •

edited

Loading